Abstract

Modern smartphones contain motion sensors, such as accelerometers
and gyroscopes. These sensors have many useful applications;
however, they can also be used to uniquely identify a phone by
measuring anomalies in the signals, which result from manufacturing
imperfections. Such measurements can be conducted surreptitiously
in the browser and can be used to track users across applications,
websites, and visits.

We analyze techniques to mitigate such device fingerprinting either
by calibrating the sensors to eliminate the signal anomalies, or by
adding noise that obfuscates the anomalies. To do this, we first
develop a highly accurate fingerprinting mechanism that combines
multiple motion sensors and makes use of (inaudible) audio
stimulation to improve detection. We then collect measurements from
a large collection of smartphones and evaluate the impact of
calibration and obfuscation techniques on the classifier accuracy.

1 Introduction 

Smartphones are equipped with motion sensors, such as accelerometers
and gyroscopes, that are available to applications and websites and
enable a variety of novel uses. These same sensors, however, can
threaten user privacy by enabling sensor fingerprinting.
Manufacturing imperfections give each sensor unique characteristics
in the signal it produces. These characteristics can be captured in
the form of a fingerprint and used to track users across repeat
visits. The sensor fingerprint can be used to supplement or replace
other privacy-invasive tracking technologies, such as cookies or
canvas fingerprinting [44]. Since the fingerprint relies on the
physical characteristics of a particular device, it is immune to
defenses such as clearing cookies and private browsing modes.

We carry out a detailed investigation of the feasibility of
fingerprinting motion sensors in smartphones. Practical
fingerprinting faces several challenges. During a typical web
browsing session, a smartphone is either held in a user’s hand,
resulting in noisy motion inputs, or is resting on a flat surface,
minimizing the amount of sensor input. Additionally, web APIs for
accessing motion sensor data have significantly lower resolution
than is available to the operating system and applications. We show
that, using machine learning techniques, it is possible to combine a
large number of features from both the accelerometer and gyroscope
sensor streams and produce highly accurate classification despite
these challenges. In some cases, we can improve the classifier
accuracy by using an inaudible sound, played through the speakers,
to stimulate the motion sensors. We evaluate our techniques in a
variety of lab settings; additionally, we collected data from
volunteer participants over the web, capturing a wide variety of
smartphone models and operating systems. In our experiments, a web
browsing session lasting under a minute is still sufficient to
generate a fingerprint that can be used to recognize the phone in
the future.

We next investigate two potential countermeasures to sensor
fingerprinting. First, we consider the use of calibration to
eliminate some of the error that results from manufacturing
imperfections. Promisingly, we find that calibrating the
accelerometer is easy and has a significant impact on classification
accuracy. Gyroscope calibration, however, is more challenging
without specialized equipment, and attempts to calibrate the
gyroscope by hand do not result in an effective countermeasure.

An alternative countermeasure is obfuscation, which introduces
additional noise to the sensor readings in the hope of hiding the
natural errors. Obfuscation has the advantage of not requiring a
calibration step; we find that by adding noise that is similar in
magnitude to the natural errors that result from manufacturing, we
can reduce the accuracy of fingerprinting more effectively than by
calibration. We also investigate the possibility of using higher
magnitude noise, as well as adding temporal disturbances to
obfuscate frequency domain features. At high levels of noise,
fingerprinting accuracy is greatly reduced, though such noise is
likely to impair the utility of motion sensors.

Roadmap. The remainder of this paper is organized as follows. We
present background information and related work in Section 2. In
Section 3, we briefly discuss why accelerometers and gyroscopes can
be used to generate unique fingerprints. In Section 4, we describe
the different temporal and spectral features considered in our
experiments, along with the classification algorithms and metrics
used in our evaluations. We present our fingerprinting results in
Section 5. Section 6 describes our countermeasure techniques to
sensor fingerprinting. We briefly discuss some deployment
considerations in Section 7. Finally, we conclude in Section 8.

2 Fingerprinting Background 

Human fingerprints, due to their unique nature, are a very popular
tool used to identify people in forensic and biometric applications
[25,51]. Researchers have long sought to find an equivalent of
fingerprints in computer systems by finding characteristics that can
help identify an individual device. Such fingerprints exploit
variation in both the hardware and software of devices to aid in
identification.

As early as 1960, the US government used unique transmission
characteristics to track mobile transmitters [36]. Later, with the
introduction of cellular networks, researchers were able to
successfully distinguish transmitters by analyzing the spectral
characteristics of the transmitted radio signal [50]. Researchers
have suggested using radio-frequency fingerprints to enhance
wireless authentication [38, 45], as well as localization [48].
Others have leveraged the minute manufacturing imperfections in
network interface cards (NICs) by analyzing the radio frequency of
the emitted signals [22,31]. Computer clocks have also been used for
fingerprinting: Moon et al. showed that network devices tend to have
a unique and constant clock skew [42]; Kohno et al. exploited this
to uniquely distinguish network devices through TCP and ICMP
timestamps [35].

Software can also serve as a distinguishing feature, as different
devices have a different installed software base. Researchers have
long been exploiting the differences in the protocol stack installed
on IEEE 802.11 compliant devices. Desmond et al. [27] have looked at
distinguishing unique devices over Wireless Local Area Networks
(WLANs) simply by performing timing analysis on the 802.11 probe
request packets. Others have investigated subtle differences in the
firmware and device drivers running on IEEE 802.11 compliant devices
[30]. 802.11 MAC headers have also been used to uniquely track
devices [32]. Moreover, there are well-known open source toolkits
like Nmap [39] and Xprobe [56] that can remotely fingerprint an
operating system by analyzing unique responses from the TCP/IP
networking stack.

Browser Fingerprinting. A common application of fingerprinting is to
track a user across multiple visits to a website, or a collection of
sites. Traditionally, this was done with the aid of cookies
explicitly stored by the browser. However, privacy concerns have
prompted web browsers to implement features that clear the cookie
store, as well as private browsing modes that do not store cookies
long-term. This has prompted site operators to develop other means
of uniquely identifying and tracking users. Eckersley’s Panopticlick
project showed that many browsers can be uniquely identified by
enumerating installed fonts and other browser characteristics,
easily accessible via JavaScript [29]. A more advanced technique
uses HTML5 canvas elements to fingerprint the fonts and rendering
engines used by the browser [44]. Others have proposed the use of
performance benchmarks for differentiating between JavaScript
engines [43]. Lastly, browsing history can be used to profile and
track online users [47]. Numerous studies have found evidence of
these and other techniques being used in the wild [19,20,46]. A
number of countermeasures to these techniques exist; typically they
disable or restrict the ability of a website to probe the
characteristics of a web browser. We expect that smartphones are
less susceptible to browser fingerprinting due to a more integrated
hardware and software base resulting in less variability, though we
are unaware of an exploration of smartphone browser fingerprinting.

Sensor Fingerprinting. Smartphones do, however, possess an array of
sensors that can be used to fingerprint them. Two studies have
looked at fingerprinting smartphone microphones and speakers
[26,57]. These techniques, however, require access to the
microphone, which is typically controlled with a separate permission
due to the obvious privacy concerns with the ability to capture
audio. Bojinov et al. [21] additionally consider using
accelerometers, which are not considered sensitive and do not
require a separate permission. Their techniques, however, rely on
having the user perform a calibration of the accelerometer (see
Section 6.1), the parameters of which are used to distinguish
phones. Dey et al. [28] apply machine learning techniques to create
an accelerometer fingerprint, but they require the vibration motor
to be active to stimulate the accelerometer sensor; in the absence
of stimulation, they report an average precision and recall of only
65%.

In contrast, our work studies phones that are in a natural
web-browsing setting, either in a user’s hand or resting on a flat
surface. Additionally, we consider the simultaneous use of both
accelerometer and gyroscope to produce a more accurate fingerprint.
Inspired by prior work that uses the gyroscope to recover audio
signals [41], we also stimulate the gyroscope with an inaudible
tone. Finally, we propose and evaluate several countermeasures to
reduce fingerprinting accuracy without entirely blocking access to
the motion sensors.


3 A Closer Look at Motion Sensors

In this section we briefly take a closer look at the motion sensors,
namely accelerometers and gyroscopes, that are embedded in today’s
smartphones. This will provide an understanding of how they can be
used to uniquely fingerprint smartphones. Accelerometer and
gyroscope sensors in modern smartphones are based on Micro Electro
Mechanical Systems (MEMS). STMicroelectronics [16] and InvenSense
[6] are among the top vendors supplying MEMS-based accelerometer and
gyroscope sensors to different smartphone manufacturers [15].
Traditionally, Apple [7,8]* and Samsung [4,5] favor using
STMicroelectronics motion sensors, while Google [13, 14] tends to
use InvenSense sensors.

3.1 Accelerometer 

An accelerometer is a device that measures proper acceleration.
Proper acceleration is different from coordinate acceleration
(linear acceleration) as it measures the g-force. For example, an
accelerometer at rest on a surface will measure an acceleration of
$g = 9.81\,\mathrm{m\,s^{-2}}$ straight upwards, while for a
free-falling object it will measure an acceleration of zero.
MEMS-based accelerometers are based on differential capacitors [10].
Figure 1 shows the internal architecture of a MEMS-based
accelerometer. As we can see, there are several pairs of fixed
electrodes and a movable seismic mass. Under zero force the
distances $d_1$ and $d_2$ are equal and as a result the two
capacitors are equal, but a change in force will cause the movable
seismic mass to shift closer to one of the fixed electrodes (i.e.,
$d_1 \neq d_2$), causing a change in the generated capacitance. This
difference in capacitance is detected and amplified to produce a
voltage proportional to the acceleration. The slightest gap between
the structural electrodes, introduced during the manufacturing
process, can cause a change in the capacitance. Also, the
flexibility of the seismic mass can be slightly different from one
chip to another. These minute imprecisions in the electro-mechanical
structure induce subtle imperfections in accelerometer chips.

* However, the iPhone 6 has been reported to use motion sensors
produced by InvenSense.


Figure 1: Internal architecture of a MEMS accelerometer.
Differential capacitance is proportional to the applied acceleration.

3.2 Gyroscope 

A gyroscope measures the rate of rotation (in $\mathrm{rad\,s^{-1}}$)
along the device’s three axes. MEMS-based gyroscopes use the
Coriolis effect to measure the angular rate. Whenever an angular
velocity $\vec{\omega}$ is exerted on a moving object of mass $m$
and velocity $\vec{v}$, the object experiences a Coriolis force in a
direction perpendicular to the rotation axis and to the velocity of
the moving object (as shown in Figure 2). The Coriolis force is
calculated by the following equation:
$\vec{F} = 2m\,\vec{v} \times \vec{\omega}$. Generally, the angular
rate ($\omega$) is measured by sensing the magnitude of the Coriolis
force exerted on a vibrating proof-mass within the gyro [11,52,54].
The Coriolis force is sensed by a capacitive sensing structure where
a change in the vibration of the proof-mass causes a change in
capacitance, which is then converted into a voltage signal by the
internal circuitry. Again, the slightest imperfection in the
electro-mechanical structure will introduce idiosyncrasies across
chips.
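
As a rough illustration of the Coriolis relation above, the following worked example plugs in purely hypothetical values (a proof-mass of $10^{-9}$ kg vibrating at 1 m/s along the x-axis while the device rotates at 1 rad/s about the z-axis). These numbers are assumptions chosen for easy arithmetic, not parameters of any real sensor.

```latex
\begin{align*}
\vec{F} &= 2m\,\vec{v} \times \vec{\omega} \\
        &= 2 \cdot 10^{-9}\,\mathrm{kg} \cdot
           \bigl(1\,\mathrm{m\,s^{-1}}\,\hat{x}\bigr) \times
           \bigl(1\,\mathrm{rad\,s^{-1}}\,\hat{z}\bigr) \\
        &= 2 \cdot 10^{-9}\,\mathrm{N}\,(\hat{x} \times \hat{z})
         = -2 \cdot 10^{-9}\,\mathrm{N}\,\hat{y}
\end{align*}
```

The force is perpendicular to both the vibration direction and the rotation axis, and its magnitude scales linearly with the angular rate, which is why sensing the force yields the rotation rate.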


Figure 2: MEMS-based gyros use Coriolis force to compute angular
velocity. The Coriolis force induces a change in capacitance which
is proportional to the angular velocity.

4 Features and Classification Algorithms 

In this section we briefly describe the data pre-processing
procedure and the features used in generating a device fingerprint.
We also discuss the classification algorithms and metrics used in
our evaluation.

4.1 Data Preprocessing 

Data from motion sensors can be thought of as a stream of
timestamped real values. For both accelerometer and gyroscope we
obtain values along three axes. So, for a given timestamp, $t$, we
have two vectors of the following form:
$\vec{a}(t) = (a_x, a_y, a_z)$ and
$\vec{\omega}(t) = (\omega_x, \omega_y, \omega_z)$. The
accelerometer values include gravity, i.e., when the device is
stationary, lying flat on top of a surface, we get a value of
$9.81\,\mathrm{m\,s^{-2}}$ along the z-axis. We convert the
acceleration vector into a scalar by taking its magnitude:

$|\vec{a}(t)| = \sqrt{a_x^2 + a_y^2 + a_z^2}$

This technique discards some information, but has the advantage of
making the accelerometer data independent of device orientation;
e.g., if the device is stationary the acceleration magnitude will
always be around $9.81\,\mathrm{m\,s^{-2}}$, whereas the reading on
each individual axis will vary greatly (by $\pm 1g$) depending on
how the device is held. For the gyroscope we consider data from each
axis as a separate stream, since there is no corresponding baseline
rotational acceleration. In other words, if the device is stationary
the rotation rate across all three axes should be close to zero
irrespective of the orientation of the device. Thus, our model
considers four streams of sensor data in the form of
$\{|\vec{a}(t)|, \omega_x(t), \omega_y(t), \omega_z(t)\}$.
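
The orientation independence of the acceleration magnitude can be checked with a small sketch. The two orientations below (flat, and tilted by 30 degrees) and the helper name `magnitude` are hypothetical and chosen only for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def magnitude(ax, ay, az):
    """Return |a(t)| for a single three-axis accelerometer sample."""
    return math.sqrt(ax * ax + ay * ay + az * az)

# The same stationary phone in two hypothetical orientations.
flat = (0.0, 0.0, G)                        # lying flat: gravity on the z-axis
tilted = (G * math.sin(math.radians(30)),   # tilted by 30 degrees:
          0.0,                              # gravity is split between x and z
          G * math.cos(math.radians(30)))

# The per-axis readings differ, but both magnitudes are about 9.81.
print(flat, magnitude(*flat))
print(tilted, magnitude(*tilted))
```

The per-axis values change with the tilt, while the scalar stream stays near $g$, which is the property the preprocessing step relies on.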

For all data streams, we also look at frequency domain
characteristics. However, since the browser, running as one of many
applications inside the phone, makes API calls to collect sensor
data, the OS might not necessarily respond in a synchronized
manner^. This results in non-equally spaced data points. We,
therefore, use cubic-spline interpolation [40] to construct new data
points such that
$\{|\vec{a}(t)|, \omega_x(t), \omega_y(t), \omega_z(t)\}$ become
equally spaced.
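
A minimal sketch of the resampling step follows. It assumes a natural cubic spline (zero second derivative at both ends); the text does not state which boundary conditions are used, so that choice, the function names, and the sample timestamps are all assumptions made for illustration.

```python
from bisect import bisect_right

def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least two points.
    """
    n = len(xs) - 1                               # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)                           # second derivatives; ends stay 0
    if n >= 2:
        # Tridiagonal system for the interior second derivatives m[1..n-1].
        diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 1)]
        rhs = [6.0 * ((ys[k + 2] - ys[k + 1]) / h[k + 1]
                      - (ys[k + 1] - ys[k]) / h[k]) for k in range(n - 1)]
        for k in range(1, n - 1):                 # forward elimination
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        m[n - 1] = rhs[n - 2] / diag[n - 2]       # back substitution
        for k in range(n - 3, -1, -1):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(t):
        i = min(max(bisect_right(xs, t) - 1, 0), n - 1)   # interval holding t
        a, b = xs[i + 1] - t, t - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate

def resample(ts, values, rate_hz):
    """Resample an unevenly spaced stream onto a uniform grid at rate_hz."""
    spline = natural_cubic_spline(ts, values)
    step = 1.0 / rate_hz
    count = int((ts[-1] - ts[0]) / step) + 1
    grid = [ts[0] + k * step for k in range(count)]
    return grid, [spline(t) for t in grid]

# Hypothetical, slightly jittered timestamps (seconds) and |a(t)| readings.
ts = [0.000, 0.011, 0.019, 0.032, 0.040, 0.051]
acc = [9.80, 9.82, 9.79, 9.81, 9.83, 9.80]
grid, acc_uniform = resample(ts, acc, rate_hz=100)
print(list(zip(grid, acc_uniform)))
```

The same routine would be applied independently to each of the four streams; only the uniform grid is shared.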

4.2 Temporal and Spectral Features 

To summarize the characteristics of a sensor data stream, we explore
a total of 25 features consisting of 10 temporal and 15 spectral
features (listed in Table 1). All of these features have been well
documented by researchers in the past. A detailed description of
each feature is available in Appendix A.
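
As an illustration of the temporal half of the feature set, the sketch below computes the ten time-domain features named in Table 1 for one stream. The exact definitions are given in Appendix A, which is not reproduced here, so the normalizations used below (population moments, non-excess kurtosis, zero-crossing rate per transition) are assumptions.

```python
import math

def temporal_features(x):
    """Compute ten time-domain features of a non-empty list of samples."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    std = math.sqrt(sum(d * d for d in dev) / n)       # population std. deviation
    # Fraction of consecutive sample pairs whose sign differs.
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return {
        "mean": mean,
        "std": std,
        "avg_dev": sum(abs(d) for d in dev) / n,
        "skewness": (sum(d ** 3 for d in dev) / n) / std ** 3 if std else 0.0,
        "kurtosis": (sum(d ** 4 for d in dev) / n) / std ** 4 if std else 0.0,
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "max": max(x),
        "min": min(x),
        "zcr": crossings / (n - 1) if n > 1 else 0.0,
        "non_negative": sum(1 for v in x if v >= 0),
    }

# Hypothetical gyroscope readings (rad/s) from a stationary phone.
gyro_x = [0.002, -0.001, 0.003, -0.002, 0.001, 0.000, -0.003, 0.002]
features = temporal_features(gyro_x)
for name, value in features.items():
    print(name, value)
```

Each of the four streams would yield its own set of values, which are then concatenated with the spectral features into a single feature vector.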


^Depending on the load and other applications running, the OS might
prioritize such API calls differently.

4.3 Classification Algorithms and Metrics

Classification Algorithms: Once we have features extracted from the
sensor data, we use supervised learning to identify the source
sensor. Any supervised learning classifier has two main phases: a
training phase and a testing phase. During training, features from
all smartphones (i.e., labeled data) are used to train the
classifier. In the test phase, the classifier predicts the most
probable class for a given (unseen) feature vector. We evaluate the
performance of the following supervised classifiers: Support Vector
Machine (SVM), Naive-Bayes classifier, Multiclass Decision Tree,
k-Nearest Neighbor (k-NN), Quadratic Discriminant Analysis (QDA)
classifier and Bagged Decision Trees (Matlab’s TreeBagger model)
[17]. We found that, in general, ensemble-based approaches like
Bagged Decision Trees outperform the other classifiers. We report
the maximum achievable accuracies from these classifiers in the
evaluation in Section 5.

Evaluation metrics: We use standard multi-class classification
metrics (precision, recall, and F-score [53]) in our evaluation.
Assuming there are n classes, we first compute the number of true
positives (TP) for each class, i.e., the number of traces from the
class that are classified correctly. Similarly, we compute the false
positives (FP) and false negatives (FN) as the number of wrongly
accepted and wrongly rejected traces, respectively, for each class
i ($1 \le i \le n$). We then compute precision, recall, and the
F-score for each class using the following equations:

Precision, $Pr_i = TP_i / (TP_i + FP_i)$    (1)

Recall, $Re_i = TP_i / (TP_i + FN_i)$    (2)

F-Score, $F_i = (2 \times Pr_i \times Re_i) / (Pr_i + Re_i)$    (3)

The F-score is the harmonic mean of precision and recall; it
provides a good measure of overall classification performance, since
precision and recall represent a trade-off: a more conservative
classifier that rejects more instances will have higher precision
but lower recall, and vice versa. To obtain the overall performance
of the system we compute average values in the following way:

Avg. Precision, $AvgPr = \frac{\sum_{i=1}^{n} Pr_i}{n}$    (4)

Avg. Recall, $AvgRe = \frac{\sum_{i=1}^{n} Re_i}{n}$    (5)

Avg. F-Score, $AvgF = \frac{2 \times AvgPr \times AvgRe}{AvgPr + AvgRe}$    (6)
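
The averaged metrics of equations (1) to (6) can be sketched in a few lines. The function name, the toy labels, and the convention of returning 0 when a denominator is zero are assumptions made for this illustration; the text does not specify how empty classes are handled.

```python
def averaged_metrics(true_labels, predicted_labels):
    """Return (AvgPr, AvgRe, AvgF) following equations (1)-(6)."""
    pairs = list(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels))
    precisions, recalls = [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)   # correctly accepted
        fp = sum(1 for t, p in pairs if t != c and p == c)   # wrongly accepted
        fn = sum(1 for t, p in pairs if t == c and p != c)   # wrongly rejected
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    n = len(classes)
    avg_pr = sum(precisions) / n
    avg_re = sum(recalls) / n
    total = avg_pr + avg_re
    avg_f = 2 * avg_pr * avg_re / total if total else 0.0
    return avg_pr, avg_re, avg_f

# Toy example with three devices and six test traces.
truth = ["A", "A", "B", "B", "C", "C"]
guess = ["A", "B", "B", "B", "C", "A"]
avg_pr, avg_re, avg_f = averaged_metrics(truth, guess)
print(round(avg_pr, 3), round(avg_re, 3), round(avg_f, 3))  # 0.722 0.667 0.693
```

Note that AvgF in equation (6) is the harmonic mean of the averaged precision and recall, not the mean of the per-class F-scores; the sketch follows the former.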


5 Fingerprinting Evaluation 

In this section we first describe our experimental setup
(Section 5.1). We then explore features with the aim of determining
the minimal subset of features required to maximize classification
accuracy (Section 5.2). Lastly, we evaluate our fingerprinting
approach under a controlled lab setting (Section 5.3), an
uncontrolled real-world setting (Section 5.4) and a combination of
both settings (Section 5.5).

Table 1: Explored acoustic features

| # | Domain | Feature | Description |
|---|--------|---------|-------------|
| 1 | Time | Mean | The arithmetic mean of the signal strength at different timestamps |
| 2 | Time | Standard Deviation | Standard deviation of the signal strength |
| 3 | Time | Average Deviation | Average deviation from mean |
| 4 | Time | Skewness | Measure of asymmetry about mean |
| 5 | Time | Kurtosis | Measure of the flatness or spikiness of a distribution |
| 6 | Time | RMS | Square root of the arithmetic mean of the squares of the signal strength at various timestamps |
| 7 | Time | Max | Maximum signal strength |
| 8 | Time | Min | Minimum signal strength |
| 9 | Time | ZCR | The rate at which the signal changes sign from positive to negative or back |
| 10 | Time | Non-Negative count | Number of non-negative values |
| 11 | Frequency | Spectral Centroid | Represents the center of mass of a spectral power distribution |
| 12 | Frequency | Spectral Spread | Defines the dispersion of the spectrum around its centroid |
| 13 | Frequency | Spectral Skewness | Represents the coefficient of skewness of a spectrum |
| 14 | Frequency | Spectral Kurtosis | Measure of the flatness or spikiness of a distribution relative to a normal distribution |
| 15 | Frequency | Spectral Entropy | Captures the peaks of a spectrum and their locations |
| 16 | Frequency | Spectral Flatness | Measures how energy is spread across the spectrum |
| 17 | Frequency | Spectral Brightness | Amount of spectral energy corresponding to frequencies higher than a given cut-off threshold |
| 18 | Frequency | Spectral Rolloff | Defines the frequency below which 85% of the distribution magnitude is concentrated |
| 19 | Frequency | Spectral Roughness | Average of all the dissonance between all possible pairs of peaks in a spectrum |
| 20 | Frequency | Spectral Irregularity | Measures the degree of variation of the successive peaks of a spectrum |
| 21 | Frequency | Spectral RMS | Square root of the arithmetic mean of the squares of the signal strength at various frequencies |
| 22 | Frequency | Low-Energy-Rate | The percentage of frames with RMS power less than the average RMS power for the whole signal |
| 23 | Frequency | Spectral Flux | Measure of how quickly the power spectrum of a signal changes |
| 24 | Frequency | Spectral Attack Time | Average rise time to spectral peaks |
| 25 | Frequency | Spectral Attack Slope | Average slope to spectral peaks |

5.1 Experimental Setup 

Our experimental setup consists of our own web page for collecting
sensor data^. We use simple JavaScript to access accelerometer and
gyroscope data (a sample code snippet is provided in Appendix C).
However, since we collect data through the browser, the maximum
obtainable sampling frequency is lower than the available hardware
sampling frequency (restricted by the underlying OS). Table 2
summarizes the sampling frequencies obtained from the top 5 mobile
browsers [18]^. We use a Samsung Galaxy S3 and an iPhone 5 to test
the sampling frequency of the different browsers. Table 2 also
highlights the motion sensors that are accessible from the different
browsers. We see that Chrome provides the best sampling frequency,
while the default Android browser is the most restrictive browser in
terms of not only sampling frequency but also access to different
motion sensors. Since Chrome is also the most popular mobile browser
[2], we collect data using the Chrome browser.

^http://datarepo.cs.illinois.edu/SensorFingei'printing.html

^We computed the average time it took to obtain 100 samples. Sample
website available at
http://datarepo.cs.illinois.edu/SamplingFreq.html

Table 2: Sampling frequency from different browsers

| OS | Browser | Sampling Frequency (~Hz) | Accessible Sensors* |
|----|---------|--------------------------|---------------------|
| Android 4.4 | Chrome | 100 | A,G |
| Android 4.4 | Android | 20 | A |
| Android 4.4 | Opera | 40 | A,G |
| Android 4.4 | UC Browser | 20 | A,G |
| Android 4.4 | Standalone App [1] | 200 | A,G |
| iOS 8.1.3 | Safari | 40 | A,G |
| iOS 8.1.3 | Standalone App [3] | 100 | A,G |

* Here ’A’ means accelerometer and ’G’ refers to gyroscope.

We start off our data collection with 30 lab smartphones. Table 3
lists the distribution of the different smartphones from which we
collect sensor data. Since gyroscopes react to audio stimulation, we
collect data under three different background audio settings: no
audio, an inaudible 20 kHz sine wave, or a popular song playing. In
the latter two scenarios, the corresponding audio file plays in the
background of the browser while data is being collected. Under each
setting we collect 10 samples, where each sample is about 5 to 8
seconds’ worth of data. Since our fingerprinting approach aims to
capture the inherent imperfections of motion sensors, we need to
keep the sensors stationary while collecting data. Therefore, by
default, we have the phone placed flat on a surface while data is
being collected, unless explicitly stated otherwise. We do, however,
also test our approach for the scenario where the user is holding
the smartphone in his/her hand while sitting down.
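
For readers who want to reproduce the inaudible stimulus, the following sketch generates a 20 kHz sine tone as a 16-bit mono WAV file using only the Python standard library. The 44.1 kHz sample rate, the amplitude, the duration, and the function name are assumptions; the text only states that a 20 kHz sine wave is played in the background.

```python
import io
import math
import struct
import wave

def sine_wave_wav(freq_hz=20000.0, seconds=1.0, rate_hz=44100, amplitude=0.5):
    """Return the bytes of a mono 16-bit PCM WAV file holding a sine tone.

    freq_hz must stay below rate_hz / 2 (the Nyquist limit).
    """
    n = int(seconds * rate_hz)
    frames = b"".join(
        struct.pack("<h", int(amplitude * 32767
                              * math.sin(2 * math.pi * freq_hz * k / rate_hz)))
        for k in range(n))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)      # mono
        writer.setsampwidth(2)      # 16-bit samples
        writer.setframerate(rate_hz)
        writer.writeframes(frames)
    return buffer.getvalue()

wav_bytes = sine_wave_wav()
print(len(wav_bytes), "bytes")
# To keep the file: open("tone_20khz.wav", "wb").write(wav_bytes)
```

A page could then loop such a file in an audio element while the motion sensor samples are being recorded.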

For training and testing the classifiers we randomly split the
dataset in such a way that 50% of the data from each device goes to
the training set while the remaining 50% goes to the test set. To
prevent any bias in the selection of the training and testing set,
we randomize the training and testing set 10 times and report the
average F-score. We also compute the 95% confidence interval, but we
found it to be less than 1% and therefore do not report it in the
rest of the paper. For analyzing and matching fingerprints we use a
desktop machine with an Intel i7-2600 3.4 GHz processor and 12 GiB
of RAM. We found that the average time required to match a new
fingerprint was around 10-100 ms.
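
The evaluation protocol described above (a per-device 50/50 split, repeated 10 times and averaged) can be sketched as follows. To stay self-contained, the sketch swaps in a 1-nearest-neighbor classifier and plain accuracy in place of the bagged decision trees and F-score used in the paper, and it runs on synthetic two-dimensional feature vectors; all names and data here are hypothetical.

```python
import random

def split_half(samples_by_device, rng):
    """Put a random 50% of each device's samples in train, the rest in test."""
    train, test = [], []
    for device, samples in samples_by_device.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train += [(vec, device) for vec in shuffled[:half]]
        test += [(vec, device) for vec in shuffled[half:]]
    return train, test

def nearest_neighbor(train, vec):
    """Label vec with the device of the closest training vector (1-NN)."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], vec))
    return min(train, key=squared_distance)[1]

def repeated_accuracy(samples_by_device, repeats=10, seed=0):
    """Average test accuracy over several random 50/50 splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        train, test = split_half(samples_by_device, rng)
        correct = sum(nearest_neighbor(train, vec) == device
                      for vec, device in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

# Synthetic data: 3 devices, 10 samples each, well-separated feature clusters.
data_rng = random.Random(1)
data = {
    f"phone{d}": [[10.0 * d + data_rng.uniform(-0.5, 0.5),
                   data_rng.uniform(-0.5, 0.5)] for _ in range(10)]
    for d in range(3)
}
mean_accuracy = repeated_accuracy(data)
print(mean_accuracy)
```

Splitting per device, rather than over the pooled samples, guarantees that every device is represented in both the training and the test set of each repetition.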

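Table 3: Types of phones used

| Maker | Model | Quantity |
|-------|-------|----------|
| Apple | iPhone 5 | 4 |
| Apple | iPhone 5s | 3 |
| Samsung | Nexus S | 14 |
| Samsung | Galaxy S3 | 4 |
| Samsung | Galaxy S4 | 5 |
| Total | | 30 |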

5.2 Feature Exploration and Selection 

At first glance, it might seem that using all features at our
disposal to identify the device is the optimal strategy. However,
including too many features can worsen performance in practice, due
to their varying accuracies and potentially conflicting signatures.
We, therefore, explore all the features and determine the subset of
features that optimizes our fingerprinting accuracy. For temporal
features, no transformation of the data stream is required, but for
spectral features we first convert the non-equally spaced data
stream into a fixed-spaced data stream using cubic spline
interpolation. We interpolate at a sampling rate of 8 kHz^. Then, we
use the following signal analysis tools and modules to extract
spectral features: MIRtoolbox [12] and LibXtract [9]. Next, we look
at feature selection, where we explore different combinations of
features to maximize our fingerprinting accuracy. We use the FEAST
toolbox [49] and utilize the Joint Mutual Information (JMI)
criterion for ranking the features; the JMI criterion is known to
provide the best tradeoff in terms of accuracy, stability, and
flexibility with small data samples [23].
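
To make the spectral side concrete, the sketch below computes three of the features from Table 1 (spectral centroid, spread, and the 85% rolloff) directly from a discrete Fourier transform. It is only an illustration: the paper relies on MIRtoolbox and LibXtract, whose exact definitions may differ, and the function names and the test tone are assumptions.

```python
import cmath
import math

def magnitude_spectrum(x, rate_hz):
    """Return (frequencies, magnitudes) of the one-sided DFT of x."""
    n = len(x)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        freqs.append(k * rate_hz / n)
        mags.append(abs(coeff))
    return freqs, mags

def spectral_features(x, rate_hz, rolloff=0.85):
    """Compute centroid, spread and rolloff frequency of a uniform stream."""
    freqs, mags = magnitude_spectrum(x, rate_hz)
    total = sum(mags)
    if total == 0:
        return {"centroid": 0.0, "spread": 0.0, "rolloff": 0.0}
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    spread = math.sqrt(
        sum((f - centroid) ** 2 * m for f, m in zip(freqs, mags)) / total)
    # Lowest frequency below which `rolloff` of the total magnitude lies.
    rolloff_freq, running = freqs[-1], 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= rolloff * total:
            rolloff_freq = f
            break
    return {"centroid": centroid, "spread": spread, "rolloff": rolloff_freq}

# Hypothetical test signal: a pure 10 Hz tone sampled at 100 Hz for 1 second.
tone = [math.sin(2 * math.pi * 10 * t / 100) for t in range(100)]
tone_features = spectral_features(tone, rate_hz=100)
print(tone_features)
```

For a pure tone, all three quantities collapse onto the tone frequency (with a spread near zero), which is a convenient sanity check before applying the functions to real sensor streams.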

Figure 3 shows the results of our feature exploration for the 30 lab
smartphones. We see that when using only accelerometer data the
F-score seems to flatten after considering the top 10 features. For
gyroscope data we see that using all 75 features (25 per data
stream) obtains the best result. Finally, when we combine both
accelerometer and gyroscope features, the top 70 features (from a
total of 100 features) seem to provide the best fingerprinting
accuracy. Among these top 70 features we found that 21 came from
accelerometer features and the remaining 49 came from gyroscope
features. In terms of the distribution between temporal and spectral
features, we found that spectral features dominated, with 44 of the
top 70 features being spectral features. We use this subset of
features in all our later evaluations.

^Although up-sampling the signal from ~100 Hz to 8 kHz does not
increase the accuracy of the signal, it does make direct application
of standard signal processing tools more convenient.

5.3 Results From Lab Setting 

First, we look at fingerprinting smartphones in a lab environment to
demonstrate the basic viability of the attack. For this purpose we
keep the smartphones stationary on top of a flat surface.
Figure 4(a) summarizes our results. We see that we can identify all
30 smartphones with near-perfect accuracy under all three scenarios
by combining the accelerometer and gyroscope features. While the
benefit of the background audio stimulation is not visible from the
figure, we will show later that audio stimulation does in fact
enhance fingerprinting accuracy in the presence of countermeasure
techniques like calibration and obfuscation (more in Section 6).
Overall, these results indicate that it is indeed possible to
fingerprint smartphones through motion sensors.

5.4 Results From Public Setting 

After gaining promising results from our relatively small-scale lab
setting, we set out to expand our data collection process to a
real-world public setting. We invited people to voluntarily
participate in our study by visiting our web page^ and following a
few simple steps to provide us with sensor data. We recruited
participants through email and online social networks. We asked
participants to provide data under two settings: the no-audio
setting and the inaudible sine-wave setting. (We avoid the
background song to make the experience less bothersome for the user
and more realistic.) Each setting collected sensor data for about
one minute, requiring a total of two minutes of participation. (We
did not ask participants to provide data under all three settings
because it would require more time, which could potentially
discourage participants from finishing their task.) On average, we
had around 10 samples per setting per device. Our data-gathering web
page plants a cookie in the form of a large random number (acting as
a unique ID) in the user’s browser, which makes it possible to
correlate data points coming from the same device. Over the course
of two weeks we received data from a total of 76 devices. However,
some participants did not follow all the steps and as a result we
were able to use only 63 of the 76 submissions. Figure 5 shows the
distribution of the different devices that participated in our
study.

Next, we apply our fingerprinting approach to the public data set.
Figure 6 shows our findings. Compared to the results from our lab
setting, we see a slight decrease in F-score, but even then we were
able to obtain an F-score of approximately 94%. Again, the benefit
of the audio stimulation is not evident from these results; however,
it will become more visible in the later sections when we discuss
countermeasure techniques.

^Available at
http://datarepo.cs.illinois.edu/DataCollectionHowPlaced.html.
Screenshots of our data collection webpage are provided in
Appendix B. We obtained approval from our Institutional Review Board
(IRB) to perform the data collection.

Figure 3: Exploring the optimal number of features for different
sensors. Panels: (a) using accelerometer data only, (b) using
gyroscope data only, (c) using both accelerometer and gyroscope
data. For a) the accelerometer, more than the top 10 features leads
to diminishing returns; b) the gyroscope, all 75 features contribute
to improved accuracy; c) the combined sensor data, more than 70
features leads to diminishing returns.

Figure 4: Average F-score for different forms of audio stimulation
(no audio, sine, song) under the lab setting. For a) smartphones are
kept on top of a desk while collecting sensor data, b) smartphones
are kept in the hand of the user while collecting sensor data.

Figure 5: Distribution of participant device model.


5.5 Results From Combined Setting 

Finally, we combine our lab data with the publicly collected data to
give us a combined dataset containing 93 different smartphones. We
apply the same set of evaluations on this combined dataset. Figure 7
highlights our findings. Again, we see that combining features from
both sensors provides the best result. In this case we obtained an
F-score of approximately 96%. All these results suggest that
smartphones can be successfully fingerprinted through motion
sensors.


5.6 Sensitivity Analysis 

5.6.1 Varying the Number of Devices 

We evaluate the accuracy of our classifier while varying the number
of devices. We pick a subset of n devices in our data set and
perform the training and testing steps for this subset. For each
value of n, we repeat the experiment 10 times, using a different
random subset of n devices each time. In this experiment we consider
the use of both accelerometer and gyroscope features, since those
produce the best performance, and focus on the no-audio and
sine-wave background scenarios. Figure 8 shows that the F-score
generally decreases as the number of devices grows, which is
expected, as an increased number of labels makes classification more
difficult. Extrapolating from the graph, we expect classification to
remain accurate even for significantly larger data sets.

Figure 6: Average F-score for different forms of audio stimulation
under the public setting. Results obtained for 63 public smartphones
where users were told to keep the smartphone on top of a desk while
collecting sensor data.

Figure 7: Average F-score for different forms of audio stimulation.
Results are obtained by combining the publicly collected data with
our lab data, giving us a total of 93 devices. All the smartphones
were kept on top of a desk while collecting sensor data.

Figure 8: Average F-score for different numbers of smartphones. The
F-score generally tends to decrease slightly as more devices are
considered.


we vary the ratio of training and testing set size. For 
this experiment we only look at data from our lab set¬ 
ting as some of the devices from our public setting did 
not have exactly 10 samples. We also consider the set¬ 
ting where there is no background audio stimulation and 
use the combined features of accelerometer and gyro¬ 
scope. Figure 9 shows our findings. While an increased 
training size improves classification accuracy, even with 
mere two training samples (of a few seconds each) are 
sufficient to achieve an F-score of ~ 98, with increased 
training set sizes producing an F-score of over 99%. 


Using both accelerometer and gyroscope data 



Figure 9: Average F-score for different ratio of training and 
test data. With only two training data we achieved a F-score of 
~98% 


5.6.2 Varying Training Set Size 

We also consider how varying the training set size im¬ 
pacts the fingerprinting accuracy. For this experiment 
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6 Countermeasures 

So far we have focused on showing how easy it is to fingerprint smartphones through motion sensors. We now shift our focus to providing a systematic approach to defending against such fingerprinting techniques. We propose two approaches: sensor calibration and data obfuscation.


6.1 Calibration 


Bojinov et al. [21] observe that their phones have calibration errors, and use these calibration differences as a mechanism to distinguish between them. In particular, they consider an affine error model: $a^M = g \cdot a + o$, where $a$ is the true acceleration along an axis and $a^M$ is the measured value of the sensor. The two error parameters are the offset $o$ (bias away from 0) and the gain $g$, which magnifies or diminishes the acceleration value. Our classification uses many features, but we find that the mean signal value is the most discriminating feature for each of the sensor streams, which is closely related to the offset. We therefore explore whether calibrating the sensors will make them more difficult to fingerprint. We note that calibration has a side effect of improving the accuracy of sensor readings and is therefore of independent value. We perform the calibration only on the sensors in our 30 lab smartphones because we felt that calibration is too time consuming for the volunteers. Moreover, we could better control the quality of the calibration process when carried out in the lab.

First, let us briefly describe the sensor coordinate system, as the sensor framework uses a standard 3-axis coordinate system to express data values. For most sensors, the coordinate system is defined relative to the device's screen when the device is held in its default orientation (shown in figure 10). When the device is held in its default orientation, the positive x-axis is horizontal and points to the right, the positive y-axis is vertical and points up, and the positive z-axis points toward the outside of the screen face†. We compute offset and gain error in all three axes.

Calibrating the Accelerometer: Considering both offset and gain error, the measured output of the accelerometer ($a^M = [a^M_x, a^M_y, a^M_z]$) can be expressed as:

\[ a^M_i = O_i + S_i\, a_i, \qquad i \in \{x, y, z\} \qquad (7) \]

where $S = [S_x, S_y, S_z]$ and $O = [O_x, O_y, O_z]$ respectively represent the gain and offset errors along all three axes ($a = [a_x, a_y, a_z]$ refers to the actual acceleration). In the ideal world $[S_x, S_y, S_z] = [1, 1, 1]$ and $[O_x, O_y, O_z] = [0, 0, 0]$, but in reality they differ from the desired values. To compute the offset and gain error of an axis, we need data along both the positive and negative direction of that axis (one measures positive $+g$ while the other measures negative $-g$). In other words, six different static positions are used, where in each position one of the axes is aligned either along or opposite to earth's gravity. This causes the $a = [a_x, a_y, a_z]$ vector to take one of the following six possible values $\{[\pm g, 0, 0], [0, \pm g, 0], [0, 0, \pm g]\}$. For example, if $a^M_{z+}$ and $a^M_{z-}$ are two values of accelerometer reading along the positive and negative z-axis, then we can compute the offset ($O_z$) and gain ($S_z$) error using the following equation:

\[ S_z = \frac{a^M_{z+} - a^M_{z-}}{2g}, \qquad O_z = \frac{a^M_{z+} + a^M_{z-}}{2} \qquad (8) \]

† Android and iOS consider the positive and negative direction along an axis differently.


We take 10 measurements along all six directions (±x, ±y, ±z) from all our lab devices as shown in Figure 10. From these measurements we compute the average offset and gain error along all three axes using equation (8). Figure 11 shows a scatter plot of the errors along the z-axis for 30 smartphones (each point represents a single device). We can see that the devices are scattered all over the plot, which signifies that different devices have different amounts of offset and gain error. Such distinctions make fingerprinting feasible.

Figure 11: Accelerometer offset and gain error of 30 smartphones.
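The six-position procedure can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes the affine model (measured = offset + gain × true), takes standard gravity as the reference value, and the names `estimateAxisErrors` and `calibrate` are hypothetical.

```javascript
// Standard gravity in m/s^2 (assumed reference value for the static positions).
const G = 9.80665;

// Arithmetic mean of an array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Estimate offset and gain for one axis from readings taken with the axis
// aligned along gravity (+g) and opposite to gravity (-g).
function estimateAxisErrors(positiveReadings, negativeReadings, g = G) {
  const aPlus = mean(positiveReadings);
  const aMinus = mean(negativeReadings);
  return {
    offset: (aPlus + aMinus) / 2,
    gain: (aPlus - aMinus) / (2 * g),
  };
}

// Invert the affine error model: true = (measured - offset) / gain.
function calibrate(measured, { offset, gain }) {
  return (measured - offset) / gain;
}

// Example with synthetic z-axis readings (offset 0.2 and gain 1.03 are made up).
const plus = [0.2 + 1.03 * G, 0.2 + 1.03 * G];
const minus = [0.2 - 1.03 * G, 0.2 - 1.03 * G];
const zErrors = estimateAxisErrors(plus, minus);
console.log(zErrors, calibrate(plus[0], zErrors));
```

Averaging the repeated measurements before taking the sum and difference mirrors the "10 measurements per direction" procedure and reduces the effect of sensor noise on the estimates.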


Figure 10: Calibrating accelerometer along three axes. We collect measurements along all 6 directions (±x, ±y, ±z).

Calibrating the Gyroscope: Calibrating the gyroscope is a harder problem, as we need to induce a fixed angular change to determine the offset and gain error. Similar to the accelerometer, we can represent the measured output of the gyroscope ($\omega^M = [\omega^M_x, \omega^M_y, \omega^M_z]$) using the following equation:

\[ \omega^M_i = O_i + S_i\, \omega_i, \qquad i \in \{x, y, z\} \qquad (9) \]

where again $S = [S_x, S_y, S_z]$ and $O = [O_x, O_y, O_z]$ respectively represent the gain and offset errors along all three axes. Here, $\omega = [\omega_x, \omega_y, \omega_z]$ represents the ideal/actual angular velocity. Ideally all gain and offset errors should be equal to 1 and 0 respectively. But in the real world, when the device is rotated by a fixed angle, the measured angle tends to deviate from the actual angular displacement (shown in figure 12(a)). This impacts any system that uses the gyroscope for angular-displacement measurements.

[Figure 12, panel (a): diagram of a rotating smartphone contrasting the measured rotation with the actual rotation.]

Figure 12: a) Offset and gain error in the gyroscope impact systems that use it for angular displacement measurements, b) Calibrating the gyroscope by rotating the device by 180° in the positive x-axis direction.

To calibrate the gyroscope we again need to collect data along all six different directions (±x, ±y, ±z) individually, but this time instead of keeping the device stationary we need to rotate the device by a fixed angle ($\theta$). In our setting, we set $\theta = 180°$ (or $\pi$ rad). For example, Figure 12(b) shows how we rotate the smartphone by 180° around the positive x-axis. The angular displacement along any direction can be computed from gyroscope data in the following manner:

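\[ \omega^M_i = O_i + S_i\, \omega, \qquad i \in \{\pm x, \pm y, \pm z\} \]

\[ \int_0^t \omega^M_i \, dt = \int_0^t O_i \, dt + S_i \int_0^t \omega \, dt = O_i t + S_i \theta \qquad (10) \]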


where $t$ refers to the time it took to rotate the device by an angle $\theta$ with a fixed angular velocity $\omega$. Now, for any two measurements along the opposite directions of an axis we can compute the offset and gain error using the following equation:

\[ O_i = \frac{\theta^M_{i+} + \theta^M_{i-}}{t_1 + t_2}, \qquad S_i = \frac{\theta^M_{i+} - \theta^M_{i-} - O_i\,(t_1 - t_2)}{2\theta} \qquad (11) \]

where $i \in \{x, y, z\}$; $\theta^M_{i+} = O_i t_1 + S_i \theta$ and $\theta^M_{i-} = O_i t_2 - S_i \theta$ are the measured angular displacements (the integral in equation (10)) for the positive and negative rotation; and $t_1$ and $t_2$ represent the timespans of the positive and negative measurement respectively. We take 10 measurements along all six directions (±x, ±y, ±z) and compute the average offset and gain error along all three axes. However, since it is practically impossible to manually rotate the device at a fixed angular velocity, the integration in equation (10) will introduce noise, and therefore the calculated errors will at best be approximations of the real errors. We also approximate the integral using the trapezoidal rule, which will introduce some more errors.
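The gyroscope procedure can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes the positive rotation integrates to O·t1 + S·θ and the negative rotation to O·t2 − S·θ, uses the trapezoidal rule for the integral, and the function names are hypothetical.

```javascript
// Trapezoidal-rule integral of angular velocity samples (rad/s) over their
// timestamps (s); returns the measured angular displacement in radians.
function integrateTrapezoid(timestamps, omega) {
  let angle = 0;
  for (let k = 1; k < timestamps.length; k++) {
    const dt = timestamps[k] - timestamps[k - 1];
    angle += 0.5 * (omega[k] + omega[k - 1]) * dt;
  }
  return angle;
}

// Offset and gain for one axis from a positive rotation (+theta over t1
// seconds) and a negative rotation (-theta over t2 seconds).
function estimateGyroErrors(anglePlus, t1, angleMinus, t2, theta = Math.PI) {
  const offset = (anglePlus + angleMinus) / (t1 + t2);
  const gain = (anglePlus - angleMinus - offset * (t1 - t2)) / (2 * theta);
  return { offset, gain };
}

// Example with synthetic data: true offset 0.01 rad/s and gain 0.98 (made up).
const tsPlus = [0, 0.5, 1, 1.5, 2];
const tsMinus = [0, 0.5, 1, 1.5, 2, 2.5];
const omegaPlus = tsPlus.map(() => 0.01 + 0.98 * (Math.PI / 2));
const omegaMinus = tsMinus.map(() => 0.01 - 0.98 * (Math.PI / 2.5));
const errors = estimateGyroErrors(
  integrateTrapezoid(tsPlus, omegaPlus), 2,
  integrateTrapezoid(tsMinus, omegaMinus), 2.5
);
console.log(errors);
```

With hand-rotated devices the angular velocity is not constant, so the integral (and hence the estimates) will be noisier than in this synthetic example.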

We next visualize the offset and gain error obtained from the gyroscopes of 30 smartphones (only showing the z-axis). Figure 13 shows our findings. We see results similar to those for accelerometers, with devices scattered across different regions of the plot. This suggests that gyroscopes exhibit different ranges of offset and gain error across different units.

Figure 13: Gyroscope offset and gain error of 30 smartphones.

Fingerprinting Calibrated Data: In this section we look at how calibrating sensors impacts the fingerprinting accuracy. For this setting, we first correct the raw values by removing the offset and gain errors before extracting features from them. That is, the calibrated value is $a^C = (a^M - o)/g$. We then generate fingerprints on the corrected data and train the classifiers on the new fingerprints. Figure 14 shows the average F-score for calibrated data under three scenarios, considering both cases where the devices were kept on top of a desk and in the hand of a user. When we compare the results from uncalibrated data (figure 4) to those from calibrated data, we see that the F-score reduces by almost 30% for accelerometer data but not as much for the gyroscope data. This suggests that we were able to calibrate the accelerometer much more precisely than the gyroscope, as expected given the more complex and error-prone manual calibration procedure for the gyroscope. Another interesting observation is that audio stimulation provides a significant improvement in classifier accuracy. This suggests that audio stimulation does not influence (and perhaps even hinders) the dominant features removed by the calibration, but does significantly impact secondary features that come into play once calibration is carried out. Overall, our results demonstrate that calibration is a promising technique, especially if more precise measurements can be made. Manufacturers should be encouraged to perform better calibration to both improve the accuracy of their sensors and to help protect users' privacy.



6.2 Data Obfuscation 

Basic Obfuscation: Rather than remove the calibration errors, we can instead add extra noise to hide the calibration. This approach has the advantage of not requiring a calibration step, which requires user intervention and is particularly difficult for the gyroscope sensors. As such, the obfuscation technique could be deployed with an operating system update. Obfuscation, however, adds extra noise and can therefore negatively impact the utility of the sensors (in contrast to calibration, which improves their utility). We therefore first consider small obfuscation values in a range similar to what we observed in the calibration errors above. Adding noise in this range is roughly equivalent to switching to a differently (mis)calibrated phone and therefore should cause minimal impact to the user.

Figure 14: Average F-score for calibrated data under the lab setting. Results obtained from 30 smartphones where the smartphones were kept a) on top of a desk b) in the hand of the user while collecting sensor data.

To add obfuscation noise, we compute $a^O = a^M \cdot g^O + o^O$, where $g^O$ and $o^O$ are the obfuscation gain and offset, respectively. Based on Figures 11 and 13, we choose a range of [-0.5, 0.5] for the accelerometer offset, [-0.1, 0.1] for the gyroscope offset, and [0.95, 1.05] for the gain. For each session, we pick uniformly random obfuscation gain and offset values from the range; by varying the obfuscation values we make it difficult to fingerprint repeated visits. Figure 15 summarizes our findings when we apply obfuscation to all the sensor data obtained from our 30 lab smartphones. Compared to unaltered data (figure 4), data obfuscation seems to provide a significant improvement in terms of reducing the average F-score. Depending on the type of audio stimulation, the F-score reduces by almost 10-25% when smartphones are kept stationary on the desk and by 20-45% when smartphones are kept stationary in the hand of the user. The impact of audio stimulation in fingerprinting motion sensors is much more visible in these results. We see that the F-score increases by almost 15% when a song is being played in the background; again, we expect this to be a consequence of us having hidden the calibration errors that are the primary discriminant between phones.

Figure 15: Average F-score for obfuscated data under the lab setting. Results obtained from 30 smartphones where the smartphones were kept a) on top of a desk b) in the hand of the user while collecting sensor data.
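The basic obfuscation step described above can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes one gain/offset pair is drawn per axis per session (the text does not say whether values are per axis or per sensor), and the helper names and the injectable `rand` parameter are hypothetical.

```javascript
// Base obfuscation ranges quoted in the text.
const BASE_RANGES = {
  accelOffset: [-0.5, 0.5],
  gyroOffset: [-0.1, 0.1],
  gain: [0.95, 1.05],
};

// Uniform random draw from [lo, hi); `rand` defaults to Math.random.
function uniform([lo, hi], rand = Math.random) {
  return lo + (hi - lo) * rand();
}

// Draw one offset/gain pair per axis for a session (assumption: per-axis values).
function newSessionParams(offsetRange, gainRange, rand = Math.random) {
  return ["x", "y", "z"].map(() => ({
    offset: uniform(offsetRange, rand),
    gain: uniform(gainRange, rand),
  }));
}

// Apply obfuscated = measured * gain + offset to a 3-axis sample [x, y, z].
function obfuscateSample(sample, params) {
  return sample.map((v, axis) => v * params[axis].gain + params[axis].offset);
}

// Example: obfuscate one accelerometer sample for a new session.
const accelParams = newSessionParams(BASE_RANGES.accelOffset, BASE_RANGES.gain);
console.log(obfuscateSample([0.1, 0.2, 9.8], accelParams));
```

Because the parameters are redrawn for every session, two visits from the same device see different effective offset and gain values, which is what makes repeated visits harder to link.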

Next, we apply similar techniques to the public and combined datasets. We apply the same range of offset and gain errors to the raw values before generating fingerprints. Figure 16 summarizes our results for both presence and absence of audio stimulation. We see that the F-score reduces by approximately 20-40% (Figure 14(a)). We expect the lower accuracy is a consequence of a larger data set, suggesting that for even larger sets the impact of obfuscation is likely to be even more pronounced.

Figure 16: Average F-score for different forms of audio stimulation for obfuscated data. Results obtained from a) 63 public smartphones b) 93 smartphones (by combining the 63 public smartphones with our 30 lab phones) where the smartphones were kept on top of a desk while collecting sensor data.

Increasing the Obfuscation Range: We next look at how the fingerprinting technique reacts to different ranges of obfuscation. Starting with our base ranges of [-0.5, 0.5] and [-0.1, 0.1] for the accelerometer and gyroscope offsets, respectively, and [0.95, 1.05] for the gain, we linearly scale the ranges and observe the impact on the average F-score. We scale all ranges by the same amount, increasing the ranges symmetrically on both sides of the interval midpoint.
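Scaling a range symmetrically about its midpoint can be illustrated with a short helper. The function name `scaleRange` is hypothetical and the snippet is only a sketch of the arithmetic described above.

```javascript
// Scale an interval [lo, hi] by `factor`, keeping its midpoint fixed.
function scaleRange([lo, hi], factor) {
  const mid = (lo + hi) / 2;
  const half = ((hi - lo) / 2) * factor;
  return [mid - half, mid + half];
}

// Example: the base ranges from the text scaled by a factor of 10.
console.log(scaleRange([-0.5, 0.5], 10));  // accelerometer offset range
console.log(scaleRange([-0.1, 0.1], 10));  // gyroscope offset range
console.log(scaleRange([0.95, 1.05], 10)); // gain range
```

Note that the gain range is centered at 1 rather than 0, so scaling it by 10 widens it to roughly [0.5, 1.5] instead of multiplying its endpoints.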

For this experimental setup we only consider the combined dataset, as this contains the largest number of devices (93 in total). We also restrict ourselves to the setting where we combine both the accelerometer and gyroscope features, because this provides the best result (as evident from all our past results). Figure 17 highlights our findings. As we can see, increasing the obfuscation range does reduce the F-score, but with diminishing returns. For a 10x increment, the F-score drops to approximately 40% and 55% for no-audio and audio stimulation respectively. Beyond a 10x increment (not shown) the reduction in F-score is minimal (at most a 10% reduction at a 50x increment). This result suggests that simply obfuscating the raw values is not sufficient to hide all unique characteristics of the sensors. So far we have only manipulated the signal value but did not alter any of the frequency features, and as a result the classifier is still able to utilize the spectral features to uniquely distinguish individual devices.


Figure 17: Impact of obfuscation range as the range is linearly scaled up from 1x to 10x of the base range. [Panel title: Using both accelerometer and gyroscope data.]

Enhanced Obfuscation: Given that the spectral features are not impacted by our obfuscation techniques, we now focus on adding noise to the frequency of the sensor signal. Our data injection procedure is described in algorithm 1. The main idea is to probabilistically insert a modified version of the current data point in between the past and current timestamp, where the timestamp itself is randomly selected. Doing so will influence the cubic interpolation of the data stream, which in turn will impact the spectral features extracted from the data stream.

Algorithm 1 Obfuscated Data Injection

Input: Time series data D[t], probability Pr,
       obfuscation range Obf_range, offset O, gain S
Output: Modified data stream MD[t]

last_timestamp ← Null
offset ← Null
gain ← Null
# Random(range): randomly selects a value in range
for i = 1 to length(D) do
    # New data insertion
    if i > 1 and Random([0, 1]) < Pr then
        offset ← Random(Obf_range^offset)
        gain ← Random(Obf_range^gain)
        time ← Random([last_timestamp, i])
        MD[time] ← InsertData(D[i], offset, gain)
    end if
    # Original data
    MD[i] ← InsertData(D[i], O, S)
    last_timestamp ← i
end for
return MD
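A possible JavaScript rendering of Algorithm 1 is sketched below. Several details are assumptions rather than facts from the paper: samples are modeled as `{ t, v }` objects, `InsertData` is taken to mean value × gain + offset, the per-session offset and gain are passed as `session`, and `rand` is injectable only to make the sketch testable.

```javascript
// Uniform random draw from [lo, hi).
function uniform([lo, hi], rand) {
  return lo + (hi - lo) * rand();
}

// Sketch of Algorithm 1. `data` is an array of { t, v } samples ordered by
// time; `offsetRange` and `gainRange` bound the values used for injected
// points; `session` holds the offset and gain applied to original points.
function injectObfuscatedData(data, pr, offsetRange, gainRange, session, rand = Math.random) {
  const out = [];
  let lastT = null;
  for (const { t, v } of data) {
    // With probability pr, insert a modified copy of the current point at a
    // random time between the previous and the current timestamp.
    if (lastT !== null && rand() < pr) {
      const offset = uniform(offsetRange, rand);
      const gain = uniform(gainRange, rand);
      const time = uniform([lastT, t], rand);
      out.push({ t: time, v: v * gain + offset, injected: true });
    }
    // The original point is kept, obfuscated with the session-level values.
    out.push({ t, v: v * session.gain + session.offset, injected: false });
    lastT = t;
  }
  return out;
}

// Example: inject with probability 0.4 using 10x the base accelerometer ranges.
const stream = [{ t: 0, v: 9.7 }, { t: 10, v: 9.8 }, { t: 20, v: 9.9 }];
console.log(
  injectObfuscatedData(stream, 0.4, [-5, 5], [0.5, 1.5], { offset: 0.1, gain: 1.01 })
);
```

Because injected points carry irregular timestamps, any downstream resampling or interpolation onto a uniform grid sees a perturbed signal, which is how the spectral features are affected.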


To evaluate our approach we first fix an obfuscation range. We choose 10x of the base range from the previous section as our fixed obfuscation range. We then vary the probability of data injection over [0, 1]. Figure 18 shows our findings. We can see that even with a relatively small amount of data injection (< 0.4) we can reduce the average F-score to approximately 15-20%, depending on what type of input stimulation is applied.


Figure 18: Impact of randomly inserting new data points. [Panel title: Using both accelerometer and gyroscope data; x-axis: probability of injecting new data samples.]


7 Deployment Considerations 

We envision our obfuscation technique as middleware sitting between the OS and user applications. Under the default setting, data is always obfuscated unless the user explicitly allows an application to access unaltered sensor data. For example, a 3-D game might need access to raw accelerometer and gyroscope data instead of the obfuscated data to operate properly, in which case this will be noticeable to the user, who can then provide the appropriate permission to the application. Our default obfuscated setting will ensure that users do not have to worry about applications such as browsers accessing sensor data without their awareness.

8 Conclusion 

In this paper we show that motion sensors such as accelerometers and gyroscopes can be used to uniquely identify smartphones. The more concerning matter is that these sensors can be surreptitiously accessed from the browser without user awareness. We also show that injecting audio stimulation in the background improves the detection rate, as sensors like gyroscopes react to acoustic stimulation differently.

Our countermeasure techniques, however, mitigate such threats by obfuscating anomalies in sensor data. We were able to significantly reduce fingerprinting accuracy by employing a simple, yet effective obfuscation technique that injects random data points inside the generated sensor data stream. As a general conclusion we suggest using our obfuscation technique in the absence of explicit user permission/awareness.

A Feature Description

Mean Signal Value: This feature computes the arithmetic mean of a signal amplitude. In the case of a set of N values $\{x_1, x_2, \ldots, x_N\}$, the mean value is given by the following formula:

\[ \mu = \frac{1}{N}(x_1 + x_2 + \cdots + x_N) \qquad (12) \]

The mean value provides an approximation of the average signal strength.

Signal Variance: This feature computes the dispersion in signal strength. For a set of N values $\{x_1, x_2, \ldots, x_N\}$, the standard deviation is given by the following formula:

\[ \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2} \qquad (13) \]

where $\mu$ refers to the mean signal strength. Variance measures the spread of a signal strength.

Average Deviation: This feature measures the average distance from the mean. In the case of a set of N values $\{x_1, x_2, \ldots, x_N\}$, the average deviation is computed using the following formula:

\[ AvgDev = \frac{1}{N}\sum_{i=1}^{N}|x_i - \mu| \qquad (14) \]

where $\mu$ refers to the mean signal strength.

Skewness: This feature measures asymmetry about the mean. For a set of N values $\{x_1, x_2, \ldots, x_N\}$, the skewness is computed as:

\[ \text{Skewness} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i - \mu}{\sigma}\right)^3 \qquad (15) \]

where $\mu$ and $\sigma$ respectively represent the mean and standard deviation of signal strength.

Kurtosis: This feature measures the flatness or spikiness of a distribution. For a set of N values $\{x_1, x_2, \ldots, x_N\}$, the kurtosis is computed as:

\[ \text{Kurtosis} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i - \mu}{\sigma}\right)^4 \qquad (16) \]

where $\mu$ and $\sigma$ respectively represent the mean and standard deviation of signal strength.

Root-Mean-Square (RMS) Energy: This feature computes the square root of the arithmetic mean of the squares of the original audio signal strength at various frequencies. In the case of a set of N values $\{x_1, x_2, \ldots, x_N\}$, the RMS value is given by the following formula:

\[ x_{rms} = \sqrt{\frac{1}{N}\left(x_1^2 + x_2^2 + \cdots + x_N^2\right)} \qquad (17) \]

The RMS value provides an approximation of the average audio signal strength.

Zero Crossing Rate (ZCR): The zero-crossing rate is the rate at which the signal changes sign from positive to negative or back [24]. ZCR for a signal s of length T can be defined as:

\[ ZCR = \frac{1}{T}\sum_{t=1}^{T}|s(t) - s(t-1)| \qquad (18) \]

where $s(t) = 1$ if the signal has a positive amplitude at time $t$ and 0 otherwise. Zero-crossing rates provide a measure of the noisiness of the signal.

Low Energy Rate: The low energy rate computes the percentage of frames (typically 50ms chunks) with RMS power less than the average RMS power for the whole signal.

Spectral Centroid: The spectral centroid represents the "center of mass" of a spectral power distribution. It is calculated as the weighted mean of the frequencies present in the signal, determined using a Fourier transform, with their magnitudes as the weights:

\[ \text{Centroid},\ \mu = \frac{\sum_{i=1}^{N} f_i\, m_i}{\sum_{i=1}^{N} m_i} \qquad (19) \]

where $m_i$ represents the magnitude of bin number $i$, and $f_i$ represents the center frequency of that bin.

Spectral Entropy: Spectral entropy captures the spikiness of a spectral distribution. To compute spectral entropy, a Discrete Fourier Transform (DFT) of the signal is first carried out. Next, the frequency spectrum is converted into a probability mass function (PMF) by normalizing the spectrum using the following equation:

\[ w_i = \frac{m_i}{\sum_{i=1}^{N} m_i} \qquad (20) \]

where $m_i$ represents the energy/magnitude of the i-th frequency component of the spectrum. $w = (w_1, w_2, \ldots, w_N)$ is the PMF of the spectrum and N is the number of points in the spectrum. This PMF can then be used to compute the spectral entropy using the following equation:

\[ \text{Entropy} = -\sum_{i=1}^{N} w_i \log w_i \qquad (21) \]

The central idea of using entropy as a feature is to capture the peaks of the spectrum and their location.

Spectral Spread: Spectral spread defines the dispersion of the spectrum around its centroid, i.e., it measures the standard deviation of the spectral distribution. So it can be computed as:

\[ \text{Spread},\ \sigma = \sqrt{\sum_{i=1}^{N}\left[(f_i - \mu)^2 \cdot w_i\right]} \qquad (22) \]

where $w_i$ represents the weight of the i-th frequency component obtained from equation (20) and $\mu$ represents the centroid of the spectrum obtained from equation (19).
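Several of the temporal and spectral features defined so far can be computed with a few short functions. The sketch below is illustrative only: it assumes 1/N normalization for the moments, a base-2 logarithm for the entropy, and a spectrum supplied as parallel arrays of bin center frequencies and magnitudes; all function names are hypothetical.

```javascript
// Mean, standard deviation, average deviation, skewness, and kurtosis of a
// signal, using 1/N normalization (an assumption of this sketch).
function timeDomainStats(x) {
  const n = x.length;
  const mean = x.reduce((s, v) => s + v, 0) / n;
  const centralMoment = (p) => x.reduce((s, v) => s + (v - mean) ** p, 0) / n;
  const std = Math.sqrt(centralMoment(2));
  const avgDev = x.reduce((s, v) => s + Math.abs(v - mean), 0) / n;
  return {
    mean,
    std,
    avgDev,
    skewness: centralMoment(3) / std ** 3,
    kurtosis: centralMoment(4) / std ** 4,
  };
}

// Root-mean-square of a signal.
function rms(x) {
  return Math.sqrt(x.reduce((s, v) => s + v * v, 0) / x.length);
}

// Zero-crossing rate: number of changes in the sign indicator s(t)
// (1 if positive, 0 otherwise) divided by the signal length.
function zeroCrossingRate(x) {
  let crossings = 0;
  for (let t = 1; t < x.length; t++) {
    const prev = x[t - 1] > 0 ? 1 : 0;
    const cur = x[t] > 0 ? 1 : 0;
    crossings += Math.abs(cur - prev);
  }
  return crossings / x.length;
}

// Spectral centroid, entropy, and spread from bin center frequencies `freqs`
// and non-negative bin magnitudes `mags` (arrays of equal length).
function spectralShape(freqs, mags) {
  const total = mags.reduce((s, m) => s + m, 0);
  const w = mags.map((m) => m / total); // PMF of the spectrum
  const centroid = freqs.reduce((s, f, i) => s + f * w[i], 0);
  const entropy = -w.reduce((s, p) => s + (p > 0 ? p * Math.log2(p) : 0), 0);
  const spread = Math.sqrt(
    freqs.reduce((s, f, i) => s + (f - centroid) ** 2 * w[i], 0)
  );
  return { centroid, entropy, spread };
}

// Example on a short made-up signal and a made-up four-bin spectrum.
console.log(timeDomainStats([0.1, -0.2, 0.3, 0.05]));
console.log(rms([0.1, -0.2, 0.3, 0.05]), zeroCrossingRate([0.1, -0.2, 0.3, 0.05]));
console.log(spectralShape([5, 10, 15, 20], [0.2, 1.0, 0.4, 0.1]));
```

A constant signal has zero standard deviation, so the skewness and kurtosis in this sketch would be undefined (NaN) for it; a production feature extractor would need to handle that case explicitly.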

Spectral Skewness: Spectral skewness computes the coefficient of skewness of a spectrum. Skewness (third central moment) measures the symmetry of the distribution. A distribution can be positively skewed, in which case it has a long tail to the right, while a negatively-skewed distribution has a longer tail to the left. A symmetrical distribution has a skewness of zero. The coefficient of skewness is the ratio of the skewness to the standard deviation raised to the third power.

\[ \text{Skewness} = \frac{\mu_3}{\sigma^3} \qquad (23) \]

where $\mu_3$ denotes the third central moment and $\sigma$ the standard deviation of the spectral distribution.

Spectral Kurtosis: Spectral kurtosis gives a measure of the flatness or spikiness of a distribution relative to a normal distribution. It is computed from the fourth central moment using the following function:

\[ \text{Kurtosis} = \frac{\mu_4}{\sigma^4} \qquad (24) \]

where $\mu_4$ denotes the fourth central moment and $\sigma$ the standard deviation of the spectral distribution. A kurtosis value of 3 means the distribution is similar to a normal distribution, whereas values less than 3 refer to flatter distributions and values greater than 3 refer to steeper distributions.

Spectral Flatness: Spectral flatness measures how energy is spread across the spectrum, giving a high value when energy is equally distributed and a low value when energy is concentrated in a small number of narrow frequency bands. The spectral flatness is calculated by dividing the geometric mean of the power spectrum by the arithmetic mean of the power spectrum [34]:

\[ \text{Flatness} = \frac{\left[\prod_{i=1}^{N} m_i\right]^{1/N}}{\frac{1}{N}\sum_{i=1}^{N} m_i} \qquad (25) \]

where $m_i$ represents the magnitude of bin number $i$. One advantage of using spectral flatness is that it is not affected by the amplitude of the signal.

Spectral Brightness: Spectral brightness calculates the amount of spectral energy corresponding to frequencies higher than a given cut-off threshold. Spectral brightness can be computed using the following equation:

\[ \text{Brightness} = \sum_{i=f_c}^{N} m_i \qquad (26) \]

where $f_c$ is the cut-off frequency (set to 1500Hz) and $m_i$ is the magnitude of the i-th frequency component of the spectrum.

Figure 19: Screenshot of our data collection website. [Screenshot panels: an instruction page asking participants to place the phone, free of headphones, on a flat surface such as a desk or table; and a sampling page where 10 samples are taken sequentially (about 1 minute), with "Raw Data" and "Sine Wave (20kHz)" options.]

Spectral Rolloff: The spectral rolloff is defined as the frequency below which 85% of the distribution magnitude is concentrated [55]:

\[ \operatorname*{argmin}_{f_c \in \{1, \ldots, N\}} \sum_{i=1}^{f_c} m_i \geq 0.85 \cdot \sum_{i=1}^{N} m_i \qquad (27) \]

where $f_c$ is the rolloff frequency and $m_i$ is the magnitude of the i-th frequency component of the spectrum.
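Spectral flatness, brightness, and rolloff can be sketched in the same style. This is an illustrative sketch rather than the authors' feature extractor: the spectrum is assumed to be given as parallel arrays of bin center frequencies and magnitudes, brightness is left unnormalized, and the function names are hypothetical.

```javascript
// Spectral flatness: geometric mean over arithmetic mean of the magnitudes.
// The geometric mean is computed in the log domain to avoid underflow.
function spectralFlatness(mags) {
  const n = mags.length;
  const logSum = mags.reduce((s, m) => s + Math.log(m), 0);
  const geometric = Math.exp(logSum / n);
  const arithmetic = mags.reduce((s, m) => s + m, 0) / n;
  return geometric / arithmetic;
}

// Spectral brightness: summed magnitude of bins at or above the cut-off.
function spectralBrightness(freqs, mags, cutoffHz) {
  return mags.reduce((s, m, i) => (freqs[i] >= cutoffHz ? s + m : s), 0);
}

// Spectral rolloff: lowest bin frequency at which the cumulative magnitude
// reaches `fraction` of the total magnitude.
function spectralRolloff(freqs, mags, fraction = 0.85) {
  const total = mags.reduce((s, m) => s + m, 0);
  let cumulative = 0;
  for (let i = 0; i < mags.length; i++) {
    cumulative += mags[i];
    if (cumulative >= fraction * total) return freqs[i];
  }
  return freqs[freqs.length - 1];
}

// Example on a made-up four-bin spectrum.
const binFreqs = [10, 20, 30, 40];
const binMags = [10, 1, 1, 1];
console.log(spectralFlatness(binMags));
console.log(spectralBrightness(binFreqs, binMags, 30));
console.log(spectralRolloff(binFreqs, binMags));
```

Scaling every magnitude by the same constant leaves the flatness and the rolloff frequency unchanged, which matches the remark that flatness is not affected by signal amplitude.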

Spectral Irregularity: Spectral irregularity measures the degree of variation of the successive peaks of a spectrum. This feature provides the ability to capture the jitter or noise in a spectrum. Spectral irregularity is computed as the sum of the square of the difference in amplitude between adjoining spectral peaks [33] using the following equation:

\[ \text{Irregularity} = \sum_{i=1}^{N}(a_i - a_{i+1})^2 \qquad (28) \]

where $a_i$ is the amplitude of the i-th spectral peak and the (N+1)-th peak is assumed to be zero. A change in irregularity changes the perceived timbre of a sound.

Spectral Flux: Spectral flux is a measure of how quickly the power spectrum of a signal changes. It is calculated by taking the average Euclidean distance between the power spectra of two contiguous frames.

Spectral Attack Time: This feature computes the average rise time to spectral attacks, where spectral attacks are local maxima in the spectrum [37].

Spectral Attack Slope: This feature computes the average slope to spectral attacks, where spectral attacks are local maxima in the spectrum [37].


C Accessing Motion Sensors From Browser

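To access motion sensors, a handler needs to be registered for DeviceMotionEvent (devicemotion) events. A sample JavaScript snippet is given below:

let agx, agy, agz, ai, arAlpha, arBeta, arGamma;

if (typeof window.DeviceMotionEvent !== "undefined") {
    // Register the handler once; also assigning window.ondevicemotion
    // would invoke the handler twice per event.
    window.addEventListener("devicemotion", motionHandler);
}

function motionHandler(event) {
    // Acceleration including gravity along each axis.
    agx = event.accelerationIncludingGravity.x;
    agy = event.accelerationIncludingGravity.y;
    agz = event.accelerationIncludingGravity.z;
    // Interval, in milliseconds, at which data is obtained from the hardware.
    ai = event.interval;
    // Rotation rate around each axis (may be null without a gyroscope).
    const rR = event.rotationRate;
    if (rR != null) {
        arAlpha = rR.alpha;
        arBeta = rR.beta;
        arGamma = rR.gamma;
    }
}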


B Screenshot of Our Data Collection Webpage


We provide screenshots (see figure 19) of our data collection website to give a better idea of how participants were asked to participate.